Individual glacier surface velocity analysis#

This notebook will build upon the data access and inspection steps in the earlier notebooks and demonstrate basic data analysis and visualization of surface velocity data at the scale of an individual glacier using xarray.

Learning goals:

  • using xarray label-based indexing and selection tools

  • computation and grouped computation

  • visualization

import os
import json
import urllib.request
import numpy as np
import xarray as xr
import rioxarray as rxr
import geopandas as gpd
import pandas as pd

import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

from shapely.geometry import Polygon
from shapely.geometry import Point
import cartopy.crs as ccrs
from cartopy.mpl.gridliner import LONGITUDE_FORMATTER, LATITUDE_FORMATTER
import cartopy
import cartopy.feature as cfeature

import flox

%config InlineBackend.figure_format='retina'
import itslivetools
with urllib.request.urlopen('https://its-live-data.s3.amazonaws.com/datacubes/catalog_v02.json') as url_catalog:
    itslive_catalog = json.loads(url_catalog.read().decode())
itslive_catalog.keys()
dict_keys(['type', 'features'])
url = itslivetools.find_granule_by_point(itslive_catalog, [95.180191, 30.645973])
url
['http://its-live-data.s3.amazonaws.com/datacubes/v02/N30E090/ITS_LIVE_vel_EPSG32646_G0120_X750000_Y3350000.zarr']
dc = itslivetools.read_in_s3(url[0])
dc
<xarray.Dataset>
Dimensions:                    (mid_date: 6554, y: 833, x: 833)
Coordinates:
  * mid_date                   (mid_date) datetime64[ns] 2021-05-13T04:30:01....
  * x                          (x) float64 7.001e+05 7.003e+05 ... 8e+05
  * y                          (y) float64 3.4e+06 3.4e+06 ... 3.3e+06 3.3e+06
Data variables: (12/54)
    acquisition_date_img1      (mid_date) datetime64[ns] dask.array<chunksize=(6554,), meta=np.ndarray>
    acquisition_date_img2      (mid_date) datetime64[ns] dask.array<chunksize=(6554,), meta=np.ndarray>
    autoRIFT_software_version  (mid_date) <U5 dask.array<chunksize=(6554,), meta=np.ndarray>
    chip_size_height           (mid_date, y, x) float32 dask.array<chunksize=(6554, 70, 70), meta=np.ndarray>
    chip_size_width            (mid_date, y, x) float32 dask.array<chunksize=(6554, 70, 70), meta=np.ndarray>
    date_center                (mid_date) datetime64[ns] dask.array<chunksize=(6554,), meta=np.ndarray>
    ...                         ...
    vy_error_mask              (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
    vy_error_modeled           (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
    vy_error_slow              (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
    vy_stable_shift            (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
    vy_stable_shift_mask       (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
    vy_stable_shift_slow       (mid_date) float64 dask.array<chunksize=(6554,), meta=np.ndarray>
Attributes: (12/18)
    GDAL_AREA_OR_POINT:         Area
    author:                     ITS_LIVE, a NASA MEaSUREs project (its-live.j...
    autoRIFT_parameter_file:    http://its-live-data.s3.amazonaws.com/autorif...
    datacube_software_version:  1.0
    date_created:               09-Jun-2022 09:28:39
    date_updated:               09-Jun-2022 09:28:39
    ...                         ...
    s3:                         s3://its-live-data/datacubes/v02/N30E090/ITS_...
    skipped_granules:           s3://its-live-data/datacubes/v02/N30E090/ITS_...
    time_standard_img1:         UTC
    time_standard_img2:         UTC
    title:                      ITS_LIVE datacube of image_pair velocities
    url:                        https://its-live-data.s3.amazonaws.com/datacu...
dc_timesorted = dc.sortby(dc['mid_date'])

Read in vector data#

#se_asia = gpd.read_file('https://github.com/e-marshall/itslive/raw/master/rgi15_southasiaeast.geojson')
se_asia = gpd.read_file('/home/emmamarshall/Desktop/data/rgi/south_asia_east_15/15_rgi60_SouthAsiaEast.shp')
se_asia_prj = se_asia.to_crs('EPSG:32645')
se_asia_prj.explore()
Make this Notebook Trusted to load map: File -> Trust Notebook

Choose a single glacier and subset

sample_glacier_vec = se_asia_prj.loc[se_asia_prj['RGIId'] == 'RGI60-15.00639']
sample_glacier_vec
RGIId GLIMSId BgnDate EndDate CenLon CenLat O1Region O2Region Area Zmin ... Aspect Lmax Status Connect Form TermType Surging Linkages Name geometry
6 RGI60-15.00639 G095180E30646N 19990923 -9999999 95.180191 30.645973 15 3 8.688 4744 ... 38 5149 0 0 0 0 9 9 NaN POLYGON ((1286136.469 3418173.952, 1286105.304...

1 rows × 23 columns

type(sample_glacier_vec.geometry.values[0])
shapely.geometry.polygon.Polygon

Clip ITS_LIVE data to extent of sample glacier#

First, need to write the crs attr of the datacube

dc_timesorted = dc_timesorted.rio.write_crs(f"epsg:{dc_timesorted.attrs['projection']}", inplace=True)
dc_timesorted.rio.clip(sample_glacier_vec.geometry, sample_glacier_vec.crs)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[12], line 1
----> 1 dc_timesorted.rio.clip(sample_glacier_vec.geometry, sample_glacier_vec.crs)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rioxarray/raster_dataset.py:383, in RasterDataset.clip(self, geometries, crs, all_touched, drop, invert, from_disk)
    378 try:
    379     x_dim, y_dim = _get_spatial_dims(self._obj, var)
    380     clipped_dataset[var] = (
    381         self._obj[var]
    382         .rio.set_spatial_dims(x_dim=x_dim, y_dim=y_dim, inplace=True)
--> 383         .rio.clip(
    384             geometries,
    385             crs=crs,
    386             all_touched=all_touched,
    387             drop=drop,
    388             invert=invert,
    389             from_disk=from_disk,
    390         )
    391     )
    392 except MissingSpatialDimensionError:
    393     if len(self._obj[var].dims) >= 2 and not get_option(
    394         SKIP_MISSING_SPATIAL_DIMS
    395     ):

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rioxarray/raster_array.py:894, in RasterArray.clip(self, geometries, crs, all_touched, drop, invert, from_disk)
    892 crs = crs_from_user_input(crs) if crs is not None else self.crs
    893 if self.crs != crs:
--> 894     geometries = rasterio.warp.transform_geom(crs, self.crs, geometries)
    895 cropped_ds = None
    896 if from_disk:

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/env.py:401, in ensure_env.<locals>.wrapper(*args, **kwds)
    399 else:
    400     with Env.from_defaults():
--> 401         return f(*args, **kwds)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/env.py:630, in require_gdal_version.<locals>.decorator.<locals>.wrapper(*args, **kwds)
    624         elif full_kwds[param] in values:
    625             raise GDALVersionError(
    626                 'parameter "{0}={1}" requires '
    627                 'GDAL {2} {3}{4}'.format(
    628                     param, full_kwds[param], inequality, version, reason))
--> 630 return f(*args, **kwds)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/warp.py:104, in transform_geom(src_crs, dst_crs, geom, antimeridian_cutting, antimeridian_offset, precision)
     63 @ensure_env
     64 @require_gdal_version('2.1', param='antimeridian_cutting', values=[False],
     65                       is_max_version=True,
   (...)
     73         antimeridian_offset=10.0,
     74         precision=-1):
     75     """Transform geometry from source coordinate reference system into target.
     76 
     77     Parameters
   (...)
    101         Transformed geometry(s) in GeoJSON dict format
    102     """
--> 104     return _transform_geom(
    105         src_crs,
    106         dst_crs,
    107         geom,
    108         antimeridian_cutting,
    109         antimeridian_offset,
    110         precision)

File rasterio/_warp.pyx:114, in rasterio._warp._transform_geom()

File rasterio/_warp.pyx:71, in rasterio._warp._transform_single_geom()

File rasterio/_features.pyx:693, in rasterio._features.OGRGeomBuilder.build()

ValueError: Unsupported geometry type FeatureCollection
sample_glacier_raster = dc_timesorted.rio.clip(sample_glacier_vec.geometry, sample_glacier_vec.crs)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[13], line 1
----> 1 sample_glacier_raster = dc_timesorted.rio.clip(sample_glacier_vec.geometry, sample_glacier_vec.crs)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rioxarray/raster_dataset.py:383, in RasterDataset.clip(self, geometries, crs, all_touched, drop, invert, from_disk)
    378 try:
    379     x_dim, y_dim = _get_spatial_dims(self._obj, var)
    380     clipped_dataset[var] = (
    381         self._obj[var]
    382         .rio.set_spatial_dims(x_dim=x_dim, y_dim=y_dim, inplace=True)
--> 383         .rio.clip(
    384             geometries,
    385             crs=crs,
    386             all_touched=all_touched,
    387             drop=drop,
    388             invert=invert,
    389             from_disk=from_disk,
    390         )
    391     )
    392 except MissingSpatialDimensionError:
    393     if len(self._obj[var].dims) >= 2 and not get_option(
    394         SKIP_MISSING_SPATIAL_DIMS
    395     ):

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rioxarray/raster_array.py:894, in RasterArray.clip(self, geometries, crs, all_touched, drop, invert, from_disk)
    892 crs = crs_from_user_input(crs) if crs is not None else self.crs
    893 if self.crs != crs:
--> 894     geometries = rasterio.warp.transform_geom(crs, self.crs, geometries)
    895 cropped_ds = None
    896 if from_disk:

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/env.py:401, in ensure_env.<locals>.wrapper(*args, **kwds)
    399 else:
    400     with Env.from_defaults():
--> 401         return f(*args, **kwds)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/env.py:630, in require_gdal_version.<locals>.decorator.<locals>.wrapper(*args, **kwds)
    624         elif full_kwds[param] in values:
    625             raise GDALVersionError(
    626                 'parameter "{0}={1}" requires '
    627                 'GDAL {2} {3}{4}'.format(
    628                     param, full_kwds[param], inequality, version, reason))
--> 630 return f(*args, **kwds)

File ~/miniconda3/envs/itslive_tutorial/lib/python3.11/site-packages/rasterio/warp.py:104, in transform_geom(src_crs, dst_crs, geom, antimeridian_cutting, antimeridian_offset, precision)
     63 @ensure_env
     64 @require_gdal_version('2.1', param='antimeridian_cutting', values=[False],
     65                       is_max_version=True,
   (...)
     73         antimeridian_offset=10.0,
     74         precision=-1):
     75     """Transform geometry from source coordinate reference system into target.
     76 
     77     Parameters
   (...)
    101         Transformed geometry(s) in GeoJSON dict format
    102     """
--> 104     return _transform_geom(
    105         src_crs,
    106         dst_crs,
    107         geom,
    108         antimeridian_cutting,
    109         antimeridian_offset,
    110         precision)

File rasterio/_warp.pyx:114, in rasterio._warp._transform_geom()

File rasterio/_warp.pyx:71, in rasterio._warp._transform_single_geom()

File rasterio/_features.pyx:693, in rasterio._features.OGRGeomBuilder.build()

ValueError: Unsupported geometry type FeatureCollection
sample_glacier_raster
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 1
----> 1 sample_glacier_raster

NameError: name 'sample_glacier_raster' is not defined

Taking a look at a velocity time series#

Now that we have the velocity data clipped to a single glacier, let’s explore the clipped dataset. The below cell plots the mean velocity across the x and y dimensions over time.

sample_glacier_raster.v.mean(dim=['x','y']).plot(linewidth=0, marker='o', alpha = 0.3);
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[15], line 1
----> 1 sample_glacier_raster.v.mean(dim=['x','y']).plot(linewidth=0, marker='o', alpha = 0.3);

NameError: name 'sample_glacier_raster' is not defined

It looks like there is a large amount of variability in the mean velocity over time. Let’s use xarray tools to resample the time dimension.

resample_obj = sample_glacier_raster.resample(mid_date = '1M')
resample_obj
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[16], line 1
----> 1 resample_obj = sample_glacier_raster.resample(mid_date = '1M')
      2 resample_obj

NameError: name 'sample_glacier_raster' is not defined

.resample() is another grouping operation and returns an object of type xarray.core.resample.DatasetResample

sample_glacier_resample_1mo = resample_obj.mean(dim='mid_date')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[17], line 1
----> 1 sample_glacier_resample_1mo = resample_obj.mean(dim='mid_date')

NameError: name 'resample_obj' is not defined

The below plot is the initial velocity time series in blue, and the velocity data resampled to 1 month intervals in orange

sample_glacier_raster.v.mean(dim=['x','y']).plot(label = 'Mean glacier speed')
sample_glacier_resample_1mo.v.mean(dim=['x','y']).plot(label = '1 month resample')
plt.legend();
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[18], line 1
----> 1 sample_glacier_raster.v.mean(dim=['x','y']).plot(label = 'Mean glacier speed')
      2 sample_glacier_resample_1mo.v.mean(dim=['x','y']).plot(label = '1 month resample')
      3 plt.legend();

NameError: name 'sample_glacier_raster' is not defined

This is interesting! Despite what looks to be a pretty noisy signal looking at the full time series, we can start to pick out a seasonal signal and sub-annual velocity variability looking at the velocity data resampled into 1-month bins.

We could also calculate velocity anomalies…#

To do this, we will use xarray .groupby() and .map()

following example from xarray tutorial

We first define a function that subtracts the long-term mean from a single observation.

def remove_time_mean(x):
    return x-x.mean(dim='mid_date')

We then group the dataset by month and apply the function to calculate the anomaly on each group

sample_glacier_anom = sample_glacier_raster.groupby('mid_date.month').map(remove_time_mean)
sample_glacier_anom
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[20], line 1
----> 1 sample_glacier_anom = sample_glacier_raster.groupby('mid_date.month').map(remove_time_mean)
      2 sample_glacier_anom

NameError: name 'sample_glacier_raster' is not defined

Let’s observe the velocity anomaly alongside the velocity time series.

fig, axs = plt.subplots(ncols = 2, figsize=(17,7))
sample_glacier_anom.v.mean(dim=['x','y']).plot(ax=axs[1])
sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0])
axs[1].axhline(y=0, c = 'red', alpha = 0.5)
axs[0].set_title('Glacier mean magnitude of velocity (m/y) over time series')
axs[1].set_title('Glacier mean velocity anomaly (m/y) over time series');
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[21], line 2
      1 fig, axs = plt.subplots(ncols = 2, figsize=(17,7))
----> 2 sample_glacier_anom.v.mean(dim=['x','y']).plot(ax=axs[1])
      3 sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0])
      4 axs[1].axhline(y=0, c = 'red', alpha = 0.5)

NameError: name 'sample_glacier_anom' is not defined
_images/ac8bcd217d3733a7eb3570c220a03e8ba0264d62da826096e082353e225b8b49.png

In the above plot we were taking the mean over the x and y dimensions. Let’s take the mean along the mid_date dimension:

fig, axs = plt.subplots(ncols =2 , figsize=(16,7))
sample_glacier_raster.mean(dim='mid_date').v.plot(ax = axs[0])
sample_glacier_anom.mean(dim='mid_date').v.plot(ax=axs[1]);
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[22], line 2
      1 fig, axs = plt.subplots(ncols =2 , figsize=(16,7))
----> 2 sample_glacier_raster.mean(dim='mid_date').v.plot(ax = axs[0])
      3 sample_glacier_anom.mean(dim='mid_date').v.plot(ax=axs[1]);

NameError: name 'sample_glacier_raster' is not defined
_images/c5a07acfb9a8583ea9dc9620fb3d80197714103d571df31d4f3dfc227776009e.png

Grouped analysis by season#

We have a dense time series of surface velocity data for a single glacier. We can use xarray’s .groupby() to examine velocity variability further. We will start with using .groupby() to break the velocity time series into seasonal means.

seasons_gb = sample_glacier_raster.groupby(sample_glacier_raster.mid_date.dt.season).mean()
#add attrs to gb object
seasons_gb.attrs = sample_glacier_raster.attrs 
seasons_gb
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[23], line 1
----> 1 seasons_gb = sample_glacier_raster.groupby(sample_glacier_raster.mid_date.dt.season).mean()
      2 #add attrs to gb object
      3 seasons_gb.attrs = sample_glacier_raster.attrs 

NameError: name 'sample_glacier_raster' is not defined

Breaking down the above cell, we defined how we wanted to group our data (sample_glacier_raster.mid_date.dt.season) and the reduction we wanted to apply to each group (mean()). After the apply step, xarray automatically combines the groups into a single object. We can see that the seasons_gb object is an xarray.Dataset with the same dimensions and coordinates as the sample_glacier_raster object but that the seasons_gb object has a seasons dimension as well.

If you’d like to see another example of this with more detailed explanations, go here.

To visualize velocity data across the seasonal groups we just defined, we can use xarray’s faceting functionality. Faceting is a great way to visualize your data in ‘small multiples’ format.

fg = seasons_gb.v.plot(
    col='season',
);
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[24], line 1
----> 1 fg = seasons_gb.v.plot(
      2     col='season',
      3 );

NameError: name 'seasons_gb' is not defined

Extracting and visualizing data at a single point#

We can use xarray’s .sel() to extract velocity data at a single point or within a subset along given dimensions. In this example, we use .sel() to compare the magnitude of velocity of ice flow at a point in the glacier’s accumulation zone to the mean magnitude of velocit of the entire glacier.

fig, axs = plt.subplots(ncols =3, figsize=(20,5))
sample_glacier_raster.v.sel(x = 246052.5, y= 3181987.5).plot(ax=axs[0]);
sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0], alpha = 0.5);
sample_glacier_raster.v.mean(dim='mid_date').plot(ax=axs[1]);
axs[1].axvline(x=246052.5, c= 'red')
axs[1].axhline(y=3181987.5, c='red')
(sample_glacier_raster.v.sel(x = 246052.5, y= 3181987.5) - sample_glacier_raster.v.mean(dim=['x','y'])).plot(ax=axs[2], linewidth=0, marker='o', alpha = 0.5)
axs[0].set_title('Time series of average glacier speed (orange) \n and speed at point in accumulation zone (blue')
axs[2].set_title('Point speed - mean glacier speed')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[25], line 2
      1 fig, axs = plt.subplots(ncols =3, figsize=(20,5))
----> 2 sample_glacier_raster.v.sel(x = 246052.5, y= 3181987.5).plot(ax=axs[0]);
      3 sample_glacier_raster.v.mean(dim=['x','y']).plot(ax=axs[0], alpha = 0.5);
      4 sample_glacier_raster.v.mean(dim='mid_date').plot(ax=axs[1]);

NameError: name 'sample_glacier_raster' is not defined
_images/574bf7752d89b49a44f47b278a863badc641dcef791fcd47a54d312790eeb9de.png